System and method for predicting a geographic origin of content and accuracy of geotags related to content obtained from social media and other content providers

ABSTRACT

A system and method for managing geotag data associated with content within a geofeed is provided. The content may be tagged with metadata such as geotag data that may specify a location where the content was created. The generated content may be geotagged by one or more geotag sources including a GPS-enabled device, a user input, a content provider, a user profile, or other sources. The system may determine the geotag data for the content that is not already associated with geotag data. The system may determine a confidence level of the geotag data, whether already geotagged or not. The confidence level may be indicative of a likelihood that the geotag data accurately describes a location where the content was actually created.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 15/241,836, filed Aug. 19, 2016, which is a continuation of granted U.S. patent application Ser. No. 14/512,293, filed Oct. 10, 2014, now U.S. Pat. No. 9,436,690, which is a continuation of granted U.S. patent application Ser. No. 13/843,949, filed Mar. 15, 2013, now U.S. Pat. No. 8,862,589, entitled “SYSTEM AND METHOD FOR PREDICTING A GEOGRAPHIC ORIGIN OF CONTENT AND ACCURACY OF GEOTAGS RELATED TO CONTENT OBTAINED FROM SOCIAL MEDIA AND OTHER CONTENT PROVIDERS”, all of which are hereby incorporated by reference in their entireties.

FIELD OF THE INVENTION

The disclosure relates to systems and methods for predicting where content was created when the content lacks geotag data and predicting the accuracy of geotag data when the content includes geotag data, where the content is obtained from social media and other content providers.

BACKGROUND OF THE INVENTION

The availability of content such as videos, audio files, photos, text, and/or other content over networks such as the Internet has grown at impressive rates. Many Internet and other online service providers make this type of content available to enable users to post and share such content through their services. However, various limitations exist with respect to how this vast amount of information can be effectively organized and/or selectively displayed.

Geotag data (also referred to as a “geotag”), which includes geographic information that indicates where content was created, may provide a basis for such organization. Some geotag data may be automatically generated and associated with the content (e.g., a camera may create photographs that are automatically geotagged with location information) while others may be added after the content is created. Such geotag information may be used to search for and obtain content that was created from particular locations.

However, although geotags may be useful in organizing the large amount of content from social networks and other content providers, much of the social media and other content are not geotagged, either because users have privacy concerns related to geotagging or they are simply not aware of geotagging capabilities. In either case, content that otherwise could be geotagged oftentimes is not.

When content is not geotagged, some content providers may provide geotags based on a location specified in a user profile (hereinafter, “profile location”) of a given user. This approach infers that content created by the given user was created at the profile location. However, this is inaccurate in many instances, as the user may create content at locations other than the profile location. Thus, even when content is geotagged, there may be some doubt as to the accuracy of the geographic information indicated by a corresponding geotag.

SUMMARY OF THE INVENTION

The disclosure relates to systems and methods for predicting where content was created when the content lacks geotag data and predicting the accuracy of geotag data when the content includes geotag data, where the content is obtained from social media and other content providers.

In some embodiments, the system may include a computer that facilitates managing geotag data associated with content within a geofeed. The computer may include one or more processors configured to perform some or all of a functionality of a plurality of modules. For example, the one or more processors may be configured to execute a geofeed creation module, a profile-based geo-searching module, a location prediction module, a location accuracy determination module, a communication module, a user interface module, and/or other modules.

The geofeed creation module may be configured to receive a request to create a geofeed based on a specification of one or more geo-locations (hereinafter, a “geo-location specification”) and/or retrieve a previously requested geofeed (hereinafter, “geofeed request”). The request may include one or more geofeed parameters such as, for example, content providers to include (or exclude), types of content to include (or exclude), date ranges, content matching patterns, keywords, and/or other parameters that instruct the system as to which content should be included in the geofeed. As such, the one or more geofeed parameters may be used to filter content into the geofeed and/or out of the geofeed.

The geofeed creation module may generate a geofeed definition that includes the specification of the one or more geo-locations, the one or more geofeed parameters, and/or other information related to the geofeed. The geofeed definition may be updated.

To create the geofeed, the geofeed creation module may obtain the geo-location specification from the geofeed definition and generate requests that specify the one or more geo-locations specifically for individual ones of a plurality of content providers.

The profile-based geo-searching module may be configured to use a profile location of a user who created particular content to estimate the location of the content. For example, the profile-based geo-searching module may obtain a profile location of a particular user from a content provider and determine that content that is created by the particular user was created at the profile location. In this manner, content items that are not geotagged (also referred to as “non-geotagged content”) may still be found to satisfy a geo-location specification of a geofeed definition based on the profile location associated with the non-geotagged content.

In some embodiments, non-geotagged content may be obtained from content providers, such as when a geofeed definition does not specify a geo-location. In these instances, the location prediction module may be configured to predict where content was created. In other words, the location prediction module may determine geotag data for non-geotagged content.

In some embodiments, the location prediction module may predict where content was created using one or more techniques or combination of techniques. For example, the location prediction module may perform recognition techniques such as text recognition, image recognition, speech recognition, context-based recognition (e.g., recognizing buildings, signage, other indication of a particular landmark or place, and/or other geographical features in non-geotagged content and/or its associated metadata), and/or other recognition techniques that can be used to recognize location-identifying features within or related to the non-geotagged content.

In some embodiments, the location prediction module may be configured to automatically crawl hyperlinks included in the content, automatically correlate non-geotagged content and geo-tagged content (e.g., determining a correlation between non-geotagged content and geotagged content and/or tagging the non-geotagged content with geotag data associated with the geotagged content), and/or make other automatic determinations to predict where non-geotagged content was created.

The location accuracy determination module may be configured to predict, estimate, and/or otherwise determine the accuracy of a geotag assigned to and/or associated with content. For example, the location accuracy determination module may calculate or otherwise determine a geotag confidence level that indicates a likelihood (which may be expressed as a probability) that the geotag accurately indicates a location from which the content was created. The geotag confidence level may also be indicative of the likelihood that the content associated with the particular geotag was actually created within the geo-location specified in the geofeed request.

In some embodiments, the location accuracy determination module may calculate or otherwise determine the confidence level based on a source from which the geotag information originated (“geotag sources”). Geotag sources may include, for example, a GPS-enabled device, different types of location sensors used, a user input, a content provider, a user profile, the location prediction module, and/or other sources of location information. Some geotag sources may be deemed more trustworthy than others. Thus, the location accuracy determination module may assign higher or lower confidence levels to geotags depending on their geotag sources. For example, a first geotag that is based on a GPS-enabled device may be deemed to be more accurate than a second geotag that is based on the profile location. As such, the location accuracy determination module may assign a higher confidence level for the first geotag than for the second geotag.

The location accuracy determination module may use a combination of geotag sources to determine the confidence level as well. For example, if both the first geotag and the second geotag in the foregoing example indicate the same or similar location for a given content item, then the confidence level for a location prediction for the content item may be higher than if only one geotag were available or if the two geotags did not match one another.
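By way of non-limiting illustration, the following Python sketch shows one way in which agreement between two geotag sources might raise a confidence level. The disclosure does not prescribe a combination formula; the 1 km agreement radius, the treatment of the two sources as independent, and the names haversine_km and combined_confidence are all assumptions made for this example only.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def combined_confidence(first, second, max_km=1.0):
    """Combine two (location, confidence) pairs into one confidence level.

    Assumption: geotags within max_km of each other "match", and matching
    geotags are combined as if they were independent pieces of evidence.
    """
    (loc1, conf1), (loc2, conf2) = first, second
    if haversine_km(loc1, loc2) <= max_km:
        # Agreement: the result is never lower than either input.
        return 1.0 - (1.0 - conf1) * (1.0 - conf2)
    # Disagreement: fall back to the more trusted geotag alone (assumed policy).
    return max(conf1, conf2)


gps_geotag = ((40.7484, -73.9857), 0.95)      # e.g., from a GPS-enabled device
profile_geotag = ((40.7490, -73.9860), 0.75)  # e.g., from a profile location
agreeing = combined_confidence(gps_geotag, profile_geotag)
```

Under these assumptions, two agreeing geotags yield a confidence level at least as high as the more trusted of the two, which is consistent with the behavior described above.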

The communication module may be configured to communicate the geofeed comprising the content that is associated with geotags obtained from one or more geotag sources. The geofeed may be communicated to the content consumer via the user interface generated by the user interface module and/or via another communication channel.

The user interface module may be configured to generate a user interface that displays the content within a geofeed along with a confidence level indicating the accuracy of the geotag associated with the content and/or a corresponding geotag source. In some embodiments, the user interface module may display a scrollbar, a text input box, and/or other input fields that can receive a user input for a threshold confidence level. A user may indicate a threshold confidence level via an input field, and the user interface module may highlight or otherwise visually distinguish content with a confidence level that is higher than the threshold value from content with a confidence level that is lower than the threshold value.

Various other objects, features, and advantages of the invention will be apparent through the detailed description of the preferred embodiments and the drawings attached hereto. It is also to be understood that both the foregoing general description and the following detailed description are exemplary and not restrictive of the scope of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a system of managing geotag data associated with content within a geofeed, according to an aspect of the invention.

FIG. 2 illustrates a process for managing geotag data associated with content within a geofeed and determining a geotag confidence level associated with the geotag data, according to an aspect of the invention.

FIG. 3 illustrates a process for managing geotag data associated with content within a geofeed and determining a geotag confidence level associated with the geotag data based on a geotag source related to the geotag data, according to an aspect of the invention.

FIG. 4 illustrates a process for managing geotag data associated with content within a geofeed and determining a geotag confidence level associated with the geotag data where the geotag data has been obtained based on a user profile of a content creator who created the content, according to an aspect of the invention.

FIG. 5 illustrates a screenshot of an interface for communicating a geofeed comprising content associated with geotag data obtained from one or more geotag sources, according to an aspect of the invention.

FIG. 6 illustrates a screenshot of an interface for communicating a geofeed comprising content associated with geotag data that is obtained based on a user profile of a content creator who created the content, according to an aspect of the invention.

FIG. 7 illustrates a screenshot of an interface for communicating content within a geofeed based on geotag data confidence level selected by a user, according to an aspect of the invention.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 illustrates a system 100 of managing geotag data associated with content in a geofeed, according to an aspect of the invention. A geofeed includes a collection of content, aggregated from various content providers, that is relevant to a geographically definable location (hereinafter, a “geo-location”). The aggregated content (also referred to herein as “geofeed content”) may include, for example, video, audio, images, text, hyperlinks, and/or other content that may be relevant to a geo-location.

The content providers may include, for example, social media platforms (e.g., FACEBOOK, TWITTER, INSTAGRAM, FLICKR, etc.), online knowledge databases, and/or other providers that can distribute content that may be relevant to a geo-location. The geo-location may be specified by a boundary, geo coordinates (e.g., latitude, longitude, altitude/depth), an address, a school, a place name, a point of interest, a zip code, a city, a state, a country, and/or other information that can spatially identify an area. Social media platforms may, for example, register users so that the users may post content, which is distributed by the social media platform to other users. The content may be generated by content sources such as individuals, corporations, and/or other entities that may create content. As used hereinafter, “a location,” “a geo-location,” “a geographically definable location,” and similar language is not limited to a single location but may also refer to one or more such locations.

System 100 may include a computer 110, a geofeed API 122, a content consumer device 160, provider APIs 140, content providers 150, and/or other components. In some implementations, computer 110 may include one or more processors 120 configured to perform some or all of a functionality of a plurality of modules, which may be stored in a memory 121. For example, the one or more processors may be configured to execute a geofeed creation module 111, a profile-based geo-searching module 112, a location prediction module 113, a location accuracy determination module 114, a communication module 115, a user interface module 116, and/or other modules 119.

Geofeed creation module 111 may be configured to create one or more geofeeds 101 (illustrated in FIG. 1 as geofeed 101A, 101B, . . . , 101N), as described in U.S. patent application Ser. No. 13/284,455, filed Oct. 28, 2011, entitled “SYSTEM AND METHOD FOR AGGREGATING AND DISTRIBUTING GEOTAGGED CONTENT,” and U.S. patent application Ser. No. 13/619,888, filed Sep. 14, 2012, entitled “SYSTEM AND METHOD FOR GENERATING, ACCESSING, AND UPDATING GEOFEEDS” both of which are incorporated by reference herein in their entirety.

U.S. patent application Ser. No. 13/708,516, filed Dec. 7, 2012, entitled “SYSTEM AND METHOD FOR LOCATION MONITORING BASED ON ORGANIZED GEOFEEDS,” U.S. patent application Ser. No. 13/708,466, filed Dec. 7, 2012, entitled “SYSTEM AND METHOD FOR GENERATING AND MANAGING GEOFEED-BASED ALERTS,” U.S. patent application Ser. No. 13/708,404, filed Dec. 7, 2012, entitled “SYSTEM AND METHOD FOR RANKING GEOFEEDS AND CONTENT WITHIN GEOFEEDS,” U.S. patent application Ser. No. 13/788,843, filed Mar. 7, 2013, entitled “SYSTEM AND METHOD FOR DIFFERENTIALLY PROCESSING A LOCATION INPUT FOR CONTENT PROVIDERS THAT USE DIFFERENT LOCATION INPUT FORMATS,” U.S. patent application Ser. No. 13/788,760, filed Mar. 7, 2013, entitled “SYSTEM AND METHOD FOR CREATING AND MANAGING GEOFEEDS,” and U.S. patent application Ser. No. 13/788,909, filed Mar. 7, 2013, entitled “SYSTEM AND METHOD FOR TARGETED MESSAGING, WORKFLOW MANAGEMENT, AND DIGITAL RIGHTS MANAGEMENT FOR GEOFEEDS,” are all incorporated by reference in their entireties herein.

U.S. patent application Ser. No. 13/843,832, filed Mar. 15, 2013, entitled “SYSTEM AND METHOD FOR GENERATING THREE-DIMENSIONAL GEOFEEDS, ORIENTATION-BASED GEOFEEDS, AND GEOFEEDS BASED ON AMBIENT CONDITIONS BASED ON CONTENT PROVIDED BY SOCIAL MEDIA CONTENT PROVIDERS,” and U.S. Patent Application Ser. No. 61/800,951, filed Mar. 15, 2013, entitled “VIEW OF A PHYSICAL SPACE AUGMENTED WITH SOCIAL MEDIA CONTENT ORIGINATING FROM A GEO-LOCATION OF THE PHYSICAL SPACE,” are all incorporated by reference in their entireties herein.

For example, geofeed creation module 111 may be configured to receive a request to create a geofeed and/or retrieve a previously requested geofeed (hereinafter, “geofeed request”). The request may include a specification of one or more geo-locations (hereinafter, a “geo-location specification”) and one or more geofeed parameters such as, for example, providers to include (or exclude), types of content to include (or exclude), date ranges, content matching patterns, keywords, and/or other parameters that instruct the system as to which content should be included in the geofeed.
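By way of non-limiting illustration, a geofeed request and its geofeed parameters might be represented and applied as in the following Python sketch. The class GeofeedRequest, the function matches_parameters, and the item keys ("provider", "type", "text", "created") are hypothetical names chosen for this example and are not defined elsewhere in this disclosure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class GeofeedRequest:
    """Hypothetical container for a geo-location specification and geofeed parameters."""
    geo_locations: list                                   # e.g., place names or zip codes
    include_providers: set = field(default_factory=set)   # empty means any provider
    exclude_providers: set = field(default_factory=set)
    content_types: set = field(default_factory=set)       # empty means any content type
    keywords: set = field(default_factory=set)            # empty means no keyword filter
    start_date: Optional[date] = None
    end_date: Optional[date] = None


def matches_parameters(item: dict, request: GeofeedRequest) -> bool:
    """Return True if a content item passes every geofeed parameter filter."""
    if request.include_providers and item["provider"] not in request.include_providers:
        return False
    if item["provider"] in request.exclude_providers:
        return False
    if request.content_types and item["type"] not in request.content_types:
        return False
    if request.start_date and item["created"] < request.start_date:
        return False
    if request.end_date and item["created"] > request.end_date:
        return False
    if request.keywords:
        text = item["text"].lower()
        if not any(keyword.lower() in text for keyword in request.keywords):
            return False
    return True


request = GeofeedRequest(
    geo_locations=["Boston, Mass."],
    exclude_providers={"provider_b"},
    keywords={"game"},
    start_date=date(2013, 3, 1),
)
item = {"provider": "provider_a", "type": "photo",
        "text": "Great game tonight", "created": date(2013, 3, 15)}
included = matches_parameters(item, request)
```

In this sketch the parameters only filter content into or out of the geofeed; matching against the geo-location specification itself is handled separately.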

Geofeed creation module 111 may format a geofeed request specific for different provider APIs 140 (illustrated in FIG. 1 as API 140A, 140B, . . . , 140N). The provider APIs may facilitate receiving content from corresponding content providers 150 (illustrated in FIG. 1 as content providers 150A, 150B, 150C). In some implementations, geofeed creation module 111 may format a request directly for content provider 150N without using a corresponding API. Formatting instructions may be stored at a provider profile 134. A content consumer device 160 may request and view geofeeds 101 created by geofeed creation module 111.

Content provided by content providers 150 in response to a geofeed request may already have geotag data embedded in the content. For example, the content may have been automatically geotagged (e.g., by a GPS-enabled device), geotagged based on a user input (e.g., the content creator manually inputting a geo-location when creating a social media post), geotagged by a content provider (e.g., the content provider creating geotag data using various techniques as apparent to those of ordinary skill in the art), geotagged using a location specified in a user profile (hereinafter, “profile location”) of the content creator who created the particular content, and/or geotagged with other types of geotag data (hereinafter, “geotagged content”).

A user profile of a content creator often contains geographic location information such as a “home” location of the content creator since many content providers require content creators (e.g., social media users) to indicate their “home” locations in their profiles when they initially sign up for social media services. As such, in some embodiments, a particular content provider may use a profile location associated with the content creator to define a location of where the content was created and/or geotag the content with the profile location.

In other embodiments, when a content consumer requests content from a content provider which does not generate, obtain, and/or manage geotags based on content creators' profile locations, profile-based geo-searching module 112 may be configured to identify a profile location associated with the content and/or use the profile location to determine whether the content is related to one or more geo-locations specified in the geofeed request. This capability to search for content based on the profile location may be enabled or disabled. When it is enabled, profile-based geo-searching module 112 may identify a profile location associated with content by accessing the user profile of the content creator who created the particular content and/or create a geotag for the content using the profile location. For example, when it is determined that the content is not being associated with any geotags (e.g., the content was not automatically geotagged by a GPS-enabled device and/or geotagged based on a user input), profile-based geo-searching module 112 may identify a profile location and compare it with the one or more geo-locations specified in the geofeed request to determine whether the content is related to the one or more geo-locations.
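By way of non-limiting illustration, the following Python sketch shows how a profile location might be used as a fallback when content carries no geotag. It assumes, for this example only, that the geo-location is specified as a latitude/longitude bounding box and that a profile location has already been resolved to coordinates; the function names and item keys are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]                      # (latitude, longitude) in degrees
BoundingBox = Tuple[float, float, float, float]  # (south, west, north, east)


def in_bounding_box(point: Point, box: BoundingBox) -> bool:
    """Simple containment test; does not handle boxes that cross the antimeridian."""
    south, west, north, east = box
    return south <= point[0] <= north and west <= point[1] <= east


def satisfies_geo_location(item: dict, box: BoundingBox,
                           profile_search_enabled: bool = True) -> bool:
    """Check a content item against a geo-location, falling back to the profile location."""
    geotag: Optional[Point] = item.get("geotag")
    if geotag is not None:
        # Content that is already geotagged is judged on its own geotag.
        return in_bounding_box(geotag, box)
    if not profile_search_enabled:
        return False
    profile: Optional[Point] = item.get("profile_location")
    return profile is not None and in_bounding_box(profile, box)


boston_area = (42.2, -71.2, 42.4, -70.9)
post = {"geotag": None, "profile_location": (42.36, -71.06)}
found = satisfies_geo_location(post, boston_area)
```

The profile_search_enabled flag mirrors the ability, described above, to enable or disable searching based on the profile location.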

In some embodiments, content provided by content providers 150 in response to a geofeed request may include content that is not already associated with a geotag when the content is acquired by geofeed creation module 111 (hereinafter, “non-geotagged content”). For example, non-geotagged content may be deemed responsive to the geofeed request when other tags/metadata (e.g., keywords, creation time/date, etc.) associated with the non-geotagged content match the criteria defined by one or more geofeed parameters (e.g., keywords, date ranges, etc.) of the geofeed request.

Location prediction module 113 may be configured to determine geotag data for non-geotagged content (i.e., content that is not already associated with a geotag when the content is acquired by geofeed creation module 111) and/or to link the determined geotag data to the content. Location prediction module 113 may determine the geotag data that would most accurately describe a location where the content was created using various geotag recognition techniques including text, image, or speech recognition, context-based recognition, automated crawling of hyperlinks, automated correlation between non-geotagged content and geo-tagged content, or any combination thereof.

In some embodiments, location prediction module 113 may determine geotag data by recognizing buildings, signage, other indication of a particular landmark or place, and/or other geographical features in the non-geotagged content. For example, non-geotagged content may comprise a status update (e.g., FACEBOOK “status update”) referring to the Empire State Building: “I am at the Empire State Building.” Location prediction module 113 may recognize the phrase “the Empire State Building” as an indication of geo-location and/or retrieve a geo-location (e.g., geographic coordinates) that corresponds to the known location of the Empire State Building. In another example, non-geotagged content may comprise a photograph that shows the White House. Location prediction module 113 may utilize an image recognition technique to recognize that the photograph includes an image of the White House and the photograph may be geotagged with the geographic coordinates of the White House.
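By way of non-limiting illustration, the text-recognition example above might be sketched in Python as a lookup against a small table of known landmarks. The table LANDMARKS, its approximate coordinates, and the function predict_location_from_text are assumptions made for this example; a practical system would likely rely on a far larger database and more robust matching than a substring test.

```python
from typing import Optional, Tuple

# Hypothetical landmark table; coordinates are approximate (latitude, longitude).
LANDMARKS = {
    "empire state building": (40.7484, -73.9857),
    "white house": (38.8977, -77.0365),
}


def predict_location_from_text(text: str) -> Optional[Tuple[float, float]]:
    """Return coordinates of the first known landmark named in the text, if any."""
    lowered = text.lower()
    for name, coordinates in LANDMARKS.items():
        if name in lowered:
            return coordinates
    return None


predicted = predict_location_from_text("I am at the Empire State Building.")
```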

In some embodiments, the non-geotagged content may be automatically tagged with geotag data by analyzing hyperlinks included in the content. For example, the non-geotagged content may include a hyperlink that links and/or points to geotagged content. A geotag may be retrieved from the geotagged content and/or automatically applied to the non-geotagged content or it may be configured to be applied only if certain conditions have been satisfied. For instance, whether the non-geotagged content should adopt the geotag may be determined based on a temporal proximity of the creation date/time of the geotagged content to the creation date/time of the non-geotagged content. Such a condition may be necessary to ensure that the geotag data that is applied to the non-geotagged content accurately describes the location where the non-geotagged content was created. For example, a content creator may create a social media post that includes a hyperlink that links and/or points to a geotagged image of the place where he/she is at (e.g., “I am here right now, http:// . . . ”) soon after the image was created. In this case, there is a higher chance that the geotag data associated with the image accurately describes the location where the social media post was created. On the other hand, the content creator may create a social media post about the geotagged image after a predefined time has elapsed since the image was created (e.g., “I was here yesterday, http:// . . . ”). If a predefined time has elapsed since the geotagged image was created, the geotag data associated with the image may no longer be relevant to the location where a subsequent post about the image is eventually created.
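By way of non-limiting illustration, the temporal proximity condition described above might be sketched in Python as follows. The one-hour value of MAX_GAP and the function name adopt_linked_geotag are assumptions for this example; the disclosure refers only to a predefined time without fixing its length.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(hours=1)  # assumed value for the "predefined time"


def adopt_linked_geotag(post_created, linked_created, linked_geotag, max_gap=MAX_GAP):
    """Return the linked content's geotag only if the two creation times are close."""
    if linked_geotag is None:
        return None
    if abs(post_created - linked_created) <= max_gap:
        return linked_geotag
    return None


image_time = datetime(2013, 3, 15, 12, 0)
image_geotag = (40.7484, -73.9857)

# "I am here right now": posted ten minutes after the image was created.
same_visit = adopt_linked_geotag(datetime(2013, 3, 15, 12, 10), image_time, image_geotag)

# "I was here yesterday": posted a day later, so the geotag is not adopted.
next_day = adopt_linked_geotag(datetime(2013, 3, 16, 12, 10), image_time, image_geotag)
```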

Content may contain any number of other types of metadata (e.g., tags) other than a geotag such as a title, comment, description, identification of a content creator, creation/modification date/time, content type, etc. In some embodiments, location prediction module 113 may search through such metadata associated with non-geotagged content to identify metadata that may provide an indication of geo-location. For example, non-geotagged content may comprise a photograph that may be titled “the Empire State Building,” in which case location prediction module 113 may tag the content with the geo-location associated with the Empire State Building. In other embodiments, location prediction module 113 may utilize metadata associated with other content that the non-geotagged content has a relationship with. For example, the non-geotagged content and the other content may have a hierarchical relationship with each other. A photograph may belong to a photo album that is titled “Red Sox game.” Location prediction module 113 may recognize this title of the photo album as an indication of geo-location (e.g., the Fenway Park, Boston, Mass.) and tag the photograph within the album with that geo-location.
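By way of non-limiting illustration, the following Python sketch searches the metadata of a content item for an indication of geo-location and, failing that, consults the metadata of a parent item such as a photo album. The table PLACE_HINTS, the metadata field names, and the "parent" key are hypothetical and chosen only for this example.

```python
from typing import Optional

# Hypothetical mapping from metadata phrases to place descriptions.
PLACE_HINTS = {
    "empire state building": "Empire State Building, New York, N.Y.",
    "red sox game": "Fenway Park, Boston, Mass.",
}


def find_place_hint(metadata: dict) -> Optional[str]:
    """Scan selected metadata fields for a phrase that indicates a place."""
    for field_name in ("title", "description", "comment"):
        value = (metadata.get(field_name) or "").lower()
        for phrase, place in PLACE_HINTS.items():
            if phrase in value:
                return place
    return None


def predict_place(item: dict) -> Optional[str]:
    """Use the item's own metadata first, then walk up the hierarchy."""
    place = find_place_hint(item.get("metadata", {}))
    if place is None and item.get("parent") is not None:
        place = predict_place(item["parent"])
    return place


album = {"metadata": {"title": "Red Sox game"}}
photo = {"metadata": {"title": "IMG_0042"}, "parent": album}
place = predict_place(photo)
```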

In some embodiments, geographic location information of a place/location may be obtained and/or retrieved from an external database and/or other database storage that may store and/or maintain geographic information related to popular landmarks, places, etc.

In some embodiments, location prediction module 113 may be configured to make an automatic correlation between non-geotagged content and geotagged content. The geotagged content may include geotagged content that was obtained by geofeed creation module 111 in response to the current geofeed request, geotagged content that was previously obtained by geofeed creation module 111 based on past geofeed requests, content in which geotag data have been embedded by location prediction module 113, and/or any other geotagged content. Previously obtained geotagged content may be retrieved from a geofeed database 132, for example. Once a correlation is identified between the non-geotagged content and geotagged content, location prediction module 113 may tag the non-geotagged content with a geotag associated with the geotagged content.

In some embodiments, location prediction module 113 may also identify a correlation between non-geotagged content and geotagged content based on similarity in content and/or similarity between metadata (e.g., creation/modification date/time, identification of a content creator, etc.) associated with the non-geotagged content and metadata associated with the geotagged content. In one example, if the non-geotagged image contains an indication of geo-location (e.g., building, signage, etc.) that is similar to the image data of geotagged content, location prediction module 113 may determine that a correlation exists between them and/or tag the non-geotagged content with a geotag included in the geotagged content. In another example, location prediction module 113 may correlate non-geotagged content with geotagged content at least in part on the basis of similarity between metadata associated with the non-geotagged content and metadata associated with the geotagged content. For example, if the same content creator created both the non-geotagged content and the geotagged content around the same time (e.g., the content creator created the non-geotagged content within a few minutes before/after the creation time of the geotagged content), location prediction module 113 may determine that a correlation exists between them and/or tag the non-geotagged content with a geotag included in the geotagged content.
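By way of non-limiting illustration, the metadata-based correlation described above (same content creator, similar creation time) might be sketched in Python as follows. The five-minute window, the item keys, and the function name correlate_by_creator_and_time are assumptions for this example.

```python
from datetime import datetime, timedelta


def correlate_by_creator_and_time(untagged, tagged_items, window=timedelta(minutes=5)):
    """Return the geotag of the closest-in-time item by the same creator, if any."""
    best = None
    best_gap = None
    for candidate in tagged_items:
        if candidate["creator"] != untagged["creator"]:
            continue
        gap = abs(candidate["created"] - untagged["created"])
        if gap <= window and (best_gap is None or gap < best_gap):
            best, best_gap = candidate, gap
    return best["geotag"] if best is not None else None


untagged_post = {"creator": "user_1", "created": datetime(2013, 3, 15, 12, 3)}
geotagged = [
    {"creator": "user_1", "created": datetime(2013, 3, 15, 12, 0), "geotag": (42.3467, -71.0972)},
    {"creator": "user_1", "created": datetime(2013, 3, 15, 9, 0), "geotag": (40.7484, -73.9857)},
    {"creator": "user_2", "created": datetime(2013, 3, 15, 12, 2), "geotag": (38.8977, -77.0365)},
]
inherited = correlate_by_creator_and_time(untagged_post, geotagged)
```

Here the item from user_2 is ignored despite being closer in time, because the correlation in this sketch requires the same content creator.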

Location accuracy determination module 114 may be configured to predict, estimate, and/or otherwise determine the accuracy of a geotag assigned to and/or associated with content by calculating or otherwise determining a confidence level of location information defined in the geotag (“geotag confidence level”). For example, the geotag confidence level may indicate a likelihood (and/or probability) that the geotag accurately describes and/or matches a location from which the content was created. This confidence level may also be indicative of the likelihood that the content associated with the particular geotag was actually created within the geo-location specified in the geofeed request.

In some embodiments, location accuracy determination module 114 may calculate or otherwise determine the confidence level by identifying a source from which the geotag originated (“geotag sources”). Geotag sources may include, for example, a GPS-enabled device (e.g., smartphone), a user input (e.g., the content creator manually inputting a geo-location when creating a social media post), a content provider (e.g., the content provider creating geotag data using various techniques as apparent to those of ordinary skill in the art), a user profile (e.g., the “home” location of the content creator), and/or location prediction module 113 (e.g., geotagged by crawling hyperlinks within content, automatic correlation, and/or other ways as discussed herein with respect to location prediction module 113). For example, geotags provided by geotag sources such as a GPS-enabled device and/or user input may be more likely to be accurate than the ones provided by a user profile and/or location prediction module 113, and thus may be assigned a higher confidence level.

In some embodiments, depending on the type of geotag source, a geotag confidence level may have a different predetermined and/or predefined confidence level value. For example, a geotag provided by a GPS-enabled device may be assigned a predetermined confidence level (e.g., 95%) that may be different from a predetermined confidence level (e.g., 75%) that may be assigned to a geotag provided by a user profile.
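By way of non-limiting illustration, predetermined confidence levels might be stored in a simple lookup keyed by geotag source, as in the following Python sketch. Only the 95% (GPS-enabled device) and 75% (user profile) values are taken from the example above; the remaining values, the source keys, and the fallback value are assumptions for this example.

```python
# Predetermined confidence levels keyed by geotag source.
SOURCE_CONFIDENCE = {
    "gps_device": 0.95,           # from the example above
    "user_input": 0.90,           # assumed
    "content_provider": 0.80,     # assumed
    "user_profile": 0.75,         # from the example above
    "location_prediction": 0.70,  # assumed
}
DEFAULT_CONFIDENCE = 0.50  # assumed fallback for unrecognized sources


def confidence_for_source(source: str) -> float:
    """Look up the predetermined confidence level for a geotag source."""
    return SOURCE_CONFIDENCE.get(source, DEFAULT_CONFIDENCE)


gps_confidence = confidence_for_source("gps_device")
profile_confidence = confidence_for_source("user_profile")
```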

In some embodiments, for a profile-based geofeed result (e.g., having content associated with a profile location that is within a geo-location specified in a geofeed request), location accuracy determination module 114 may be configured to determine a geotag confidence level for the profile-based geofeed result based on a statistical analysis of a large quantity of geotagged content that has been geotagged by a geotag source other than a user profile (“non-profile-based geotagged content”) (e.g., content that is geotagged with geographical coordinates) where the non-profile-based geotagged content was created within a geo-location specified in a geofeed request. In doing so, location accuracy determination module 114 may obtain from content providers a large quantity of non-profile-based geotagged content that is located within the specified geo-location from which the profile-based geofeed result also originated. Location accuracy determination module 114 may identify and/or obtain a profile location associated with individual non-profile-based geotagged content based on the user profile of the content creator who created the particular content. This profile location may be compared to the geographical coordinates (or other geo-location information) defined in the geotag of the individual non-profile-based geotagged content to determine whether the profile location matches with and/or corresponds to the geotag data. A geotag location and a profile location may match and/or correspond to each other when they are directed to a location that is common to each other (e.g., two locations having an overlapping area).

In some embodiments, location accuracy determination module 114 may calculate, based on the comparison, a percentage of non-profile-based geotagged content whose geotag location matches with and/or corresponds to its profile location in a total quantity of non-profile-based geotagged content obtained by location accuracy determination module 114. This percentage value may indicate a confidence level that the profile-based geofeed result actually originated from the specified geo-location. In other words, it may indicate a degree of likelihood that the profile location associated with the profile-based geofeed result accurately corresponds to the location where the content related to the profile-based geofeed result was actually created.

In some embodiments, in calculating this percentage value, location accuracy determination module 114 may count the number of items of non-profile-based geotagged content whose geotag location (e.g., geographical coordinates) matches with and/or corresponds to its profile location and divide this counted number by the total number of items of non-profile-based geotagged content obtained by location accuracy determination module 114. As such, the higher this counted number is relative to the total, the higher the confidence level determined for the profile-based geofeed result will be.
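The count-and-divide calculation described above may be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name and the list-of-booleans input (one entry per item of non-profile-based geotagged content, true when its geotag location matches its profile location) are assumptions, as is the choice to return no value when no content was obtained.

```python
from typing import List, Optional


def profile_confidence(matches: List[bool]) -> Optional[float]:
    """Fraction of sampled items whose geotag location matches the profile location.

    `matches` holds one boolean per item of non-profile-based geotagged
    content. Returning None for an empty sample is an assumption; the text
    does not say how an empty sample should be handled.
    """
    if not matches:
        return None
    matched = sum(1 for m in matches if m)  # counted number of matching items
    return matched / len(matches)           # divided by the total number obtained
```

For instance, a sample in which two of three items match yields a value of about 0.667, which may be expressed as a percentage when displayed.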

Communication module 115 may be configured to communicate the geofeed comprising the content that is associated with geotags obtained from one or more geotag sources. The geofeed may be communicated to the content consumer via the user interface communicated via user interface module 116 and/or other communication channel.

User interface module 116 may be configured to generate a user interface that displays the content within a geofeed along with a confidence level indicating the accuracy of the geotag associated with the content and/or a corresponding geotag source. In some embodiments, user interface module 116 may display a scrollbar, a text input box, and/or other input fields that can receive a user input for a threshold confidence level. A user may indicate a threshold confidence level via an input field, and user interface module 116 may highlight or otherwise differentially display content with a confidence level that is higher than the threshold value from content with a confidence level that is lower than the threshold value.

Exemplary screenshots of interfaces generated by user interface module 116 are illustrated in FIGS. 5-7.

Those having skill in the art will recognize that computer 110 and content consumer device 160 may each comprise one or more processors, one or more interfaces (to various peripheral devices or components), memory, one or more storage devices, and/or other components coupled via a bus. The memory may comprise random access memory (RAM), read only memory (ROM), or other memory. The memory may store computer-executable instructions to be executed by the processor as well as data that may be manipulated by the processor. The storage devices may comprise floppy disks, hard disks, optical disks, tapes, or other storage devices for storing computer-executable instructions and/or data.

One or more applications, including various modules, may be loaded into memory and run on an operating system of computer 110 and/or consumer device 160. In one implementation, computer 110 and consumer device 160 may each comprise a server device, a desktop computer, a laptop, a cell phone, a smart phone, a Personal Digital Assistant, a pocket PC, or other device.

Network 102 may include any one or more of, for instance, the Internet, an intranet, a PAN (Personal Area Network), a LAN (Local Area Network), a WAN (Wide Area Network), a SAN (Storage Area Network), a MAN (Metropolitan Area Network), a wireless network, a cellular communications network, a Public Switched Telephone Network, and/or other network.

FIG. 2 illustrates a process 200 for managing geotag data associated with content within a geofeed and determining a geotag confidence level associated with the geotag data, according to an aspect of the invention. The various processing operations and/or data flows depicted in FIG. 2 (and in the other drawing figures) are described in greater detail herein. The described operations may be accomplished using some or all of the system components described in detail above and, in some implementations, various operations may be performed in different sequences and various operations may be omitted. Additional operations may be performed along with some or all of the operations shown in the depicted flow diagrams. One or more operations may be performed simultaneously. Accordingly, the operations as illustrated (and described in greater detail below) are exemplary by nature and, as such, should not be viewed as limiting.

In an operation 202, process 200 may include obtaining content associated with a geofeed. The geofeed may be created and/or retrieved by computer 110.

In an operation 204, process 200 may include determining whether metadata associated with the obtained content includes geotag data. If process 200 determines that the metadata associated with the content includes geotag data, process 200 may include determining a geotag confidence level associated with the geotag data in an operation 206. On the other hand, if the metadata associated with the content does not include geotag data, process 200 may include determining geotag data for the content in an operation 208.

In operation 208, process 200 may include determining the geotag data to be associated with the content in response to determining that the metadata associated with the content does not include the geotag data. Process 200 may determine the geotag data that would most accurately describe a location where the content was created using various geotag recognition techniques including text, image, or speech recognition, context-based recognition, automated crawling of hyperlinks, automated correlation between non-geotagged content and geo-tagged content, or any combination thereof. For example, process 200 may identify a hyperlink in the content where the hyperlink points and/or links to other geotagged content whose metadata includes geotag data. Process 200 may retrieve the geotag data from the geotagged content and/or apply it to the content.

In operation 206, process 200 may include determining a geotag confidence level associated with the geotag data where the geotag data may have been already embedded in the content obtained in operation 202 or determined using one or more geotag recognition techniques in operation 208. Process 200 may include predicting, estimating, and/or otherwise determining the accuracy of geotag data assigned and/or associated with content by calculating or otherwise determining a confidence level of location information defined in the geotag data (“geotag confidence level”). For example, the geotag confidence level may indicate a likelihood (and/or probability) that the geotag data accurately describes and/or matches a location from which the content was created. This confidence level may also be indicative of likelihood that the content associated with the particular geotag was actually created within the geo-location specified in the geofeed request.
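The branching of operations 204, 206, and 208 may be sketched as follows, purely as an illustration. The `"geotag"` metadata key and the `predict_geotag` and `score_geotag` callables are hypothetical stand-ins for the geotag recognition techniques and the confidence determination described herein; returning no result when prediction fails is likewise an assumption.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Geotag = Any  # placeholder type; the representation of a geotag is not fixed here


def process_200(
    metadata: Dict[str, Any],
    predict_geotag: Callable[[Dict[str, Any]], Optional[Geotag]],
    score_geotag: Callable[[Geotag], float],
) -> Tuple[Optional[Geotag], Optional[float]]:
    """Sketch of operations 204 (check), 208 (predict), and 206 (score)."""
    geotag = metadata.get("geotag")        # operation 204: is geotag data present?
    if geotag is None:
        geotag = predict_geotag(metadata)  # operation 208: determine geotag data
    if geotag is None:
        return None, None                  # assumed handling when prediction fails
    return geotag, score_geotag(geotag)    # operation 206: confidence level
```

In this sketch, content that arrives with embedded geotag data bypasses operation 208, while both paths converge on operation 206.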

FIG. 3 illustrates a process 300 for managing geotag data associated with content within a geofeed and determining a geotag confidence level associated with the geotag data based on a geotag source related to the geotag data, according to an aspect of the invention.

In an operation 302, process 300 may include obtaining content associated with a geofeed. The geofeed may be created and/or retrieved by computer 110.

In an operation 304, process 300 may include determining whether metadata associated with the obtained content includes geotag data. If process 300 determines that the metadata associated with the content includes geotag data, process 300 may include identifying a geotag source related to the geotag data in an operation 306. On the other hand, if the metadata associated with the content does not include geotag data, process 300 may include determining geotag data for the content in an operation 308.

In operation 308, process 300 may include determining the geotag data to be associated with the content in response to determining that the metadata associated with the content does not include the geotag data. Process 300 may determine the geotag data that would most accurately describe a location where the content was created using various geotag recognition techniques including text, image, or speech recognition, context-based recognition, automated crawling of hyperlinks, automated correlation between non-geotagged content and geo-tagged content, or any combination thereof. For example, process 300 may identify a hyperlink in the content where the hyperlink points and/or links to other geotagged content whose metadata includes geotag data. Process 300 may retrieve the geotag data from the geotagged content and/or apply it to the content.
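As a non-limiting sketch of the hyperlink example above, the following locates hyperlinks in the text of a content item and adopts the geotag data of the first linked item that is already geotagged. The in-memory `geotag_index` mapping is a hypothetical stand-in for actually retrieving the linked content and reading its metadata; the regular expression and the first-match policy are also assumptions.

```python
import re
from typing import Dict, Optional, Tuple

LatLon = Tuple[float, float]

# Simplified URL pattern, assumed for illustration; it is not a full URL parser.
URL_PATTERN = re.compile(r"https?://\S+")


def geotag_from_links(text: str, geotag_index: Dict[str, LatLon]) -> Optional[LatLon]:
    """Return the geotag of the first linked, already geotagged item, if any.

    `geotag_index` maps a URL to the geotag data of the content at that URL.
    It is a hypothetical substitute for crawling the hyperlink.
    """
    for url in URL_PATTERN.findall(text):
        url = url.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        if url in geotag_index:
            return geotag_index[url]
    return None
```

Geotag data obtained in this manner would be attributed to location prediction module 113 as its geotag source.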

In operation 306, process 300 may include identifying a source from which the geotag data originated (“geotag sources”). Geotag sources may include, for example, a GPS-enabled device (e.g., smartphone), a user input (e.g., the content creator manually inputting a geo-location when creating a social media post), a content provider (e.g., the content provider creating geotag data using various techniques as apparent to those of ordinary skill in the art), a user profile (e.g., the “home” location of the content creator), and/or location prediction module 113 (e.g., geotagged by crawling hyperlinks within content, automatic correlation, and/or other ways as discussed herein with respect to location prediction module 113). For example, the geotag data determined in operation 308 may be associated with a geotag source such as location prediction module 113 whereas the geotag data already embedded in the content obtained in operation 302 may be associated with a geotag source such as a GPS-enabled device, a user input, a content provider, a user profile, etc.

In an operation 310, process 300 may include determining a geotag confidence level associated with the geotag data based on the geotag source identified in operation 306. For example, geotags provided by geotag sources such as a GPS-enabled device and/or user input may be more likely to be accurate than the ones provided by a user profile and/or location prediction module 113, and thus may be assigned a higher confidence level.

FIG. 4 illustrates a process 400 for managing geotag data associated with content within a geofeed and determining a geotag confidence level associated with the geotag data where the geotag data has been obtained based on a user profile of a content creator who created the content, according to an aspect of the invention.

In an operation 402, process 400 may include obtaining content associated with a geofeed. The geofeed may be created and/or retrieved by computer 110.

In an operation 404, process 400 may include determining whether metadata associated with the obtained content includes geotag data. If process 400 determines that the metadata associated with the content includes geotag data, process 400 may include identifying a geotag source related to the geotag data in an operation 406. On the other hand, if the metadata associated with the content does not include geotag data, process 400 may include determining geotag data for the content in an operation 408.

In operation 408, process 400 may include determining the geotag data to be associated with the content in response to determining that the metadata associated with the content does not include the geotag data. Process 400 may determine the geotag data that would most accurately describe a location where the content was created using various geotag recognition techniques including text, image, or speech recognition, context-based recognition, automated crawling of hyperlinks, automated correlation between non-geotagged content and geo-tagged content, or any combination thereof. For example, process 400 may identify a hyperlink in the content where the hyperlink points and/or links to other geotagged content whose metadata includes geotag data. Process 400 may retrieve the geotag data from the geotagged content and/or apply it to the content.

In operation 406, process 400 may include identifying a source from which the geotag data originated (“geotag sources”). Geotag sources may include, for example, a GPS-enabled device (e.g., smartphone), a user input (e.g., the content creator manually inputting a geo-location when creating a social media post), a content provider (e.g., the content provider creating geotag data using various techniques as apparent to those of ordinary skill in the art), a user profile (e.g., the “home” location of the content creator), and/or location prediction module 113 (e.g., geotagged by crawling hyperlinks within content, automatic correlation, and/or other ways as discussed herein with respect to location prediction module 113). For example, the geotag data determined in operation 408 may be associated with a geotag source such as location prediction module 113 whereas the geotag data already embedded in the content obtained in operation 402 may be associated with a geotag source such as a GPS-enabled device, a user input, a content provider, a user profile, etc.

In an operation 410, process 400 may include determining whether the geotag source identified in operation 406 is a user profile. If process 400 determines that the geotag source identified in operation 406 is not a user profile, process 400 may proceed to an operation 418. On the other hand, if the geotag source identified in operation 406 is a user profile, process 400 may include obtaining from content providers a large quantity of non-profile-based geotagged content that is located within a geo-location specified in the geofeed request in an operation 412.

In an operation 414, process 400 may include identifying and/or obtaining a profile location associated with individual non-profile-based geotagged content based on the user profile of the content creator who created the particular content.

In an operation 416, process 400 may include determining whether the geotag data and the profile location of the non-profile-based geotagged content match. In doing so, the profile location may be compared to the geographical coordinates (or other geo-location information) defined in the geotag data of the individual non-profile-based geotagged content to determine whether the profile location matches with and/or corresponds to the geotag data. For example, a geotag location and a profile location may match and/or correspond to each other when they are directed to a location that is common to each other (e.g., two locations having an overlapping area).
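One way to test for an overlapping area, offered only as a sketch, is to represent each location as a latitude/longitude bounding box and to treat a pair of geographical coordinates as a box of zero extent. The `BBox` type, the helper names, and the simplifying assumption that no box crosses the antimeridian are illustrative choices, not requirements of this disclosure.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Latitude/longitude bounding box (assumed not to cross the antimeridian)."""
    south: float
    west: float
    north: float
    east: float


def point(lat: float, lon: float) -> BBox:
    """Represent a single coordinate pair as a zero-extent box."""
    return BBox(lat, lon, lat, lon)


def locations_match(a: BBox, b: BBox) -> bool:
    """True when the two boxes share at least one common point."""
    return (
        a.south <= b.north and b.south <= a.north
        and a.west <= b.east and b.west <= a.east
    )


# Illustrative bounds only; they are not authoritative limits for any place.
profile_area = BBox(south=38.8, west=-94.8, north=39.4, east=-94.3)
is_match = locations_match(point(39.1, -94.5), profile_area)
```

Other representations, such as polygons or a radius about a point, could serve the same purpose; the bounding box is used here only because its overlap test is short.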

In an operation 418, process 400 may include determining a geotag confidence level associated with the geotag data of the content (“profile-based geofeed result”) wherein the geotag data has been generated based on the user profile of the content creator who created the content. In doing so, process 400 may include calculating a percentage of non-profile-based geotagged content whose geotag location matches with and/or corresponds to its profile location in a total quantity of non-profile-based geotagged content obtained in operation 412. This percentage value may indicate a confidence level that the profile-based geofeed result actually originated from the specified geo-location. In other words, it may indicate a degree of likelihood that the profile location associated with the profile-based geofeed result accurately corresponds to the location where the content related to the profile-based geofeed result was actually created.

FIG. 5 illustrates a screenshot of an interface 500 for communicating a geofeed comprising content associated with geotag data obtained from one or more geotag sources, according to an aspect of the invention. The screenshots illustrated in FIG. 5 and other drawing figures are for illustrative purposes only. Various components may be added, deleted, moved, or otherwise changed so that the configuration, appearance, and/or content of the screenshots may be different than as illustrated in the figures. Accordingly, the graphical user interface objects as illustrated (and described in greater detail below) are exemplary by nature and, as such, should not be viewed as limiting.

Interface 500 and other interfaces described herein may be implemented as a web page communicated from computer 110 to a client, an application such as a mobile application executing on the client that generates the interface based on information communicated from computer 110, and/or other interface. Whichever type of interface is used, computer 110 may communicate the data and/or formatting instructions related to the interface to the client, causing the client to generate the various interfaces of FIG. 5 and other drawing figures. Furthermore, computer 110 may receive data from the client via the various interfaces, as would be appreciated.

Referring to FIG. 5, interface 500 may include a geofeed bounded by a geo-location defined by polygon 501. The geofeed may include content items 510-515 whose locations reside within the geo-location defined by polygon 501. Individual content items 510-515 may be associated with a geotag that has been obtained from one or more geotag sources. Geotag sources may include, for example, a GPS-enabled device (e.g., smartphone), a user input (e.g., the content creator manually inputting a geo-location when creating a social media post), a content provider (e.g., the content provider creating geotag data using various techniques as apparent to those of ordinary skill in the art), a user profile (e.g., the “home” location of the content creator), and/or location prediction module 113 (e.g., geotagged by crawling hyperlinks within content, automatic correlation, and/or other ways as discussed herein with respect to location prediction module 113).

As illustrated, when content item 510 is selected (e.g., moused over, clicked, touched, or otherwise interacted with), interface 500 may cause geotag detail element 510A to appear. Geotag detail element 510A may include information related to the geotag associated with content item 510 such as, for example, a geotag source (e.g., GPS-enabled device) that provided the geotag and/or a confidence level associated with the geotag. The confidence level may indicate a likelihood (and/or probability) that the geotag accurately describes and/or matches a location from which content item 510 was created. This confidence level may also be indicative of likelihood that content item 510 associated with the particular geotag was actually created within the geo-location defined by polygon 501.

FIG. 6 illustrates a screenshot of an interface 600 for communicating a geofeed comprising content associated with geotag data that is obtained based on a user profile of a content creator who created the content, according to an aspect of the invention.

Interface 600 may include a geofeed bounded by a geo-location defined by polygon 601. The geofeed may include profile-based geofeed result 610 which has been geotagged based on a profile location included in a user profile of a content creator who created the content in profile-based geofeed result 610.

Computer 110 may determine a geotag confidence level indicative of a likelihood that the profile location associated with profile-based geofeed result 610 accurately corresponds to the location where the content related to profile-based geofeed result 610 was actually created based on a statistical analysis on a large quantity of geotagged content that has been geotagged by a geotag source other than a user profile (“non-profile-based geotagged content”) (e.g., content that is geotagged with geographical coordinates) where the non-profile-based geotagged content was created within the geo-location specified for the geofeed. For example, interface 600 may include non-profile-based geotagged content items 611-613 whose geotag locations are within the geo-location defined by polygon 601.

Computer 110 may identify and/or obtain a profile location associated with individual non-profile-based geotagged content items 611-613 by accessing the user profile of the content creator who created the particular content. This profile location may be compared to the geographical coordinates (or other geo-location information) defined in the geotag of the individual non-profile-based geotagged content items 611-613 to determine whether the profile location matches or otherwise corresponds to the geotag data. As illustrated, non-profile-based geotagged content item 611 may be associated with a geotag which comprises latitude and longitude coordinates of (39.098, −94.519). Computer 110 may look up a user profile of the content creator who created content item 611 to determine a profile location which may be a “home” location (e.g., Kansas City, Mo.) for the content creator. Computer 110 may compare the geographic coordinates of the geotag and the profile location to determine whether the geotag location and the profile location match and/or correspond to each other. For example, the geographic coordinates of content item 611 and content item 612 are located within the corresponding profile locations. On the other hand, the geographic coordinates of content item 613 and its profile location are spatially distant from each other and/or are not directed to a location that is common to each other. In this example, computer 110 may set the confidence level for profile-based geofeed result 610 to be approximately 67% since 2 out of 3 non-profile-based geotagged content items are associated with geotag locations that match with and/or correspond to their associated profile locations.

FIG. 7 illustrates a screenshot of an interface 700 for communicating content within a geofeed based on geotag data confidence level selected by a user, according to an aspect of the invention.

Interface 700 may include a geofeed bounded by a geo-location defined by polygon 701. The geofeed may include content items 710-715 whose locations reside within the geo-location defined by polygon 701. Individual content items 710-715 may be associated with a geotag that has been obtained from one or more geotag sources. Geotag sources may include, for example, a GPS-enabled device (e.g., smartphone), a user input (e.g., the content creator manually inputting a geo-location when creating a social media post), a content provider (e.g., the content provider creating geotag data using various techniques as apparent to those of ordinary skill in the art), a user profile (e.g., the “home” location of the content creator), and/or location prediction module 113 (e.g., geotagged by crawling hyperlinks within content, automatic correlation, and/or other ways as discussed herein with respect to location prediction module 113). Computer 110 may determine a geotag confidence level for individual content items 710-715 based on the type of geotag source that provided the geotag for the particular content item and/or using any other techniques discussed herein.

Interface 700 may include an input field that may receive a threshold confidence level selected by a user. The input field may include a scrollbar, a text input box, and/or other input mechanisms. As illustrated, interface 700 may include scrollbar 720 having movable indicator 721 for indicating a threshold confidence level against which the geotag confidence level for individual content items 710-715 is compared.

Interface 700 may highlight or otherwise differentially display content with a confidence level that is higher than the threshold confidence level from content with a confidence level that is lower than the threshold confidence level. For example, a user may use movable indicator 721 to indicate a threshold confidence level of 80%. Interface 700 may display content items 710, 712, 713 whose associated geotag confidence levels are above 80% in solid line and content items 711, 714, 715 whose associated geotag confidence levels are below 80% in dotted line. In another example, interface 700 may cause content items 711, 714, 715 to disappear since their associated geotag confidence levels are below the threshold confidence level set by the user.
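The threshold comparison described above may be sketched as a partition of content items into two display groups. The per-item confidence values below are hypothetical and chosen only so that the grouping agrees with the example; the treatment of an item whose confidence level exactly equals the threshold is not specified above and is an assumption here (such an item is placed in the lower group).

```python
from typing import Dict, List, Tuple


def partition_by_threshold(
    confidences: Dict[int, float], threshold: float
) -> Tuple[List[int], List[int]]:
    """Split item identifiers into (above-threshold, not-above-threshold) groups.

    Items exactly at the threshold fall into the second group; that tie
    rule is an assumption made for this sketch.
    """
    above = sorted(i for i, c in confidences.items() if c > threshold)
    not_above = sorted(i for i, c in confidences.items() if c <= threshold)
    return above, not_above


# Hypothetical confidence levels for content items 710-715.
example = {710: 0.95, 711: 0.75, 712: 0.90, 713: 0.85, 714: 0.60, 715: 0.70}
solid, dotted = partition_by_threshold(example, 0.80)
```

The first group may be rendered in solid line and the second in dotted line, or the second group may simply be omitted from the display.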

Other embodiments, uses and advantages of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. The specification should be considered exemplary only, and the scope of the invention is accordingly intended to be limited only by the following claims. 

What is claimed is:
 1. A computer implemented method of determining a location where a social media content item was created when the content item lacks geotag data, wherein the content item is associated with a user and posted to a social media content provider, the method being implemented on a computer having one or more physical processors programmed with one or more computer program instructions that, when executed by the one or more physical processors, program the computer to perform the method, the method comprising: obtaining, by the computer, the content item from the social media content provider; obtaining, by the computer, metadata associated with the content item; determining, by the computer, that geotag data is missing from the metadata; and responsive to the determination that the geotag data is missing from the metadata, identifying, by a computer-implemented recognition technique performed on the content item by a location prediction component of the computer, the location at which the content item was created.
 2. The method of claim 1, wherein identifying the location comprises: recognizing, by the location prediction component, text in the content item; and associating, by the location prediction component, the recognized text with a recognized location, wherein the location is identified based on the recognized location.
 3. The method of claim 1, wherein the content item comprises audio information and identifying the location comprises: performing, by the location prediction component, speech recognition on the audio information to identify a recognized location associated with the content item, wherein the location is identified based on the recognized location.
 4. The method of claim 1, wherein identifying the location comprises: identifying, by the location prediction component, a hyperlink in the content item, wherein the hyperlink links to geotagged content whose metadata includes second location information that indicates where the geotagged content was created, wherein the location is based on the second location information.
 5. The method of claim 4, the method further comprising: identifying, by the location prediction component, a correlation between the content item and the geotagged content, wherein the location is based on the correlation.
 6. The method of claim 1, wherein identifying the location comprises: recognizing, by the location prediction component, an image in the content item; and associating, by the location prediction component, the recognized image with a recognized location, wherein the location is determined based on the recognized location.
 7. The method of claim 6, the method further comprising: identifying, by the location prediction component, a geographical feature in the recognized image from the content item; electronically accessing, by the location prediction component, a database configured to store geographic information indicating recognized locations associated with one or more geographical features; and obtaining, by the location prediction component, the recognized location from the database based on the identified geographical feature.
 8. The method of claim 7, wherein the geographical feature comprises one or more recognized buildings, signage, landmarks, and/or places.
 9. The method of claim 1, wherein identifying the location comprises: obtaining, by the location prediction component, a second content item associated with the user; obtaining, by the location prediction component, second metadata associated with the second content item; obtaining, by the location prediction component, second geotag data from the second metadata, the second geotag data comprising location information that indicates a location at which the second content item was created; obtaining, by the location prediction component, a difference between a first time at which the content item was posted and a second time at which the second content item was posted; and determining, by the location prediction component, whether the difference is less than or equal to a predetermined time, wherein the location is determined based on the location in which the second content item was created responsive to a determination that the difference is less than or equal to the predetermined time.
 10. The method of claim 1, wherein the location is automatically identified responsive to the determination that the geotag data is missing from the metadata.
 11. The method of claim 1, wherein identifying the location comprises: identifying, by the location prediction component, an indication of the location in the metadata associated with the content item that is missing geotag data.
 12. The method of claim 1, the method further comprising: determining, by the computer, a confidence level associated with the location, wherein the confidence level is indicative of a likelihood that the location accurately describes the location where the content was created.
 13. A system configured to determine a location where a social media content item was created when the content item lacks geotag data, wherein the content item is associated with a user and posted to a social media content provider, the system comprising: a computer having one or more physical processors programmed by one or more computer program instructions that, when executed by the one or more physical processors, program the computer to: obtain the content item from the social media content provider; obtain metadata associated with the content item; determine that geotag data is missing from the metadata; and responsive to the determination that the geotag data is missing from the metadata, identify, by a computer-implemented recognition technique performed on the content item by a location prediction component of the computer, the location at which the content item was created.
 14. The system of claim 13, wherein to identify the location, the location prediction component is configured to: recognize text in the content item; and associate the recognized text with a recognized location, wherein the location is identified based on the recognized location.
 15. The system of claim 13, wherein the content item comprises audio information, and to identify the location, the location prediction component is configured to: perform speech recognition on the audio information to identify a recognized location associated with the content item, wherein the location is identified based on the recognized location.
 16. The system of claim 13, wherein to identify the location, the location prediction component is configured to: identify a hyperlink in the content item, wherein the hyperlink links to geotagged content whose metadata includes second location information that indicates where the geotagged content was created, wherein the location is based on the second location information.
 17. The system of claim 16, wherein the location prediction component is configured to: identify a correlation between the content item and the geotagged content, wherein the location is based on the correlation.
 18. The system of claim 13, wherein to identify the location, the location prediction component is configured to: recognize an image in the content item; and associate the recognized image with a recognized location, wherein the location is determined based on the recognized location.
 19. The system of claim 18, wherein the location prediction component is configured to: identify a geographical feature in the recognized image from the content item; electronically access a database configured to store geographic information indicating recognized locations associated with one or more geographical features; and obtain the recognized location from the database based on the identified geographical feature.
 20. The system of claim 19, wherein the geographical feature comprises one or more recognized buildings, signage, landmarks, and/or places.
 21. The system of claim 13, wherein to identify the location, the location prediction component is configured to: obtain a second content item associated with the user; obtain second metadata associated with the second content item; obtain second geotag data from the second metadata, the second geotag data comprising location information that indicates a location at which the second content item was created; obtain a difference between a first time at which the content item was posted and a second time at which the second content item was posted; and determine whether the difference is less than or equal to a predetermined time, wherein the location is determined based on the location at which the second content item was created responsive to a determination that the difference is less than or equal to the predetermined time.
 22. The system of claim 13, wherein the location is automatically identified responsive to the determination that the geotag data is missing from the metadata.
 23. The system of claim 13, wherein to identify the location, the location prediction component is configured to: identify an indication of the location in the metadata associated with the content item that is missing geotag data.
 24. The system of claim 13, wherein the computer is further programmed to: determine a confidence level associated with the location, wherein the confidence level is indicative of a likelihood that the location accurately describes the location where the content item was created.
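By way of non-limiting illustration only, the operations recited in claims 13, 14, and 21 may be sketched as follows. The sketch is not the claimed implementation. Every name in it (ContentItem, GAZETTEER, has_geotag, location_from_text, location_from_recent_post, predict_location), the "geotag" metadata key, the two-entry lookup table, and the one-hour default for the predetermined time are assumptions introduced for this example. Simple substring matching stands in for whatever text recognition technique an implementation may use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Tuple

Coords = Tuple[float, float]  # (latitude, longitude)


@dataclass
class ContentItem:
    """Hypothetical stand-in for a social media content item."""
    user: str
    posted_at: datetime
    text: str = ""
    metadata: dict = field(default_factory=dict)


# Assumption: a toy table standing in for a database of recognized locations.
GAZETTEER = {
    "eiffel tower": (48.8584, 2.2945),
    "golden gate bridge": (37.8199, -122.4783),
}


def has_geotag(item: ContentItem) -> bool:
    """Return True when the metadata carries geotag data."""
    return item.metadata.get("geotag") is not None


def location_from_text(item: ContentItem) -> Optional[Coords]:
    """Claim 14 analogue: associate recognized text with a recognized location."""
    lowered = item.text.lower()
    for name, coords in GAZETTEER.items():
        if name in lowered:
            return coords
    return None


def location_from_recent_post(item, others, max_gap) -> Optional[Coords]:
    """Claim 21 analogue: reuse the geotag of a post by the same user whose
    posting time differs by no more than the predetermined time."""
    best = None  # (time difference, coordinates) of the closest qualifying post
    for other in others:
        if other.user != item.user or not has_geotag(other):
            continue
        gap = abs(item.posted_at - other.posted_at)
        if gap <= max_gap and (best is None or gap < best[0]):
            best = (gap, other.metadata["geotag"])
    return best[1] if best is not None else None


def predict_location(item, others, max_gap=timedelta(hours=1)) -> Optional[Coords]:
    """Claim 13 analogue: predict a location only when geotag data is missing."""
    if has_geotag(item):
        return None  # geotag data is present, so no prediction is made
    found = location_from_text(item)
    if found is None:
        found = location_from_recent_post(item, others, max_gap)
    return found
```

In this sketch, text recognition is tried first and the posting-time comparison is used only as a fallback. That ordering is a choice made for the example; the claims do not prescribe an order among the recited techniques. The confidence level of claim 24 is omitted because the claims do not recite how it is computed.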